<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang=en>
<head>
<title>ESRI Shapefile</title>
</head>

<body>

<h1>ESRI Shapefile</h1>

<p>All varieties of ESRI Shapefiles should be available for reading, and
simple 3D files can be created.</p>

<p>Normally the OGR Shapefile driver treats a whole directory of shapefiles
as a dataset, and a single shapefile within that directory as a layer.  In
this case the directory name should be used as the dataset name.  However,
it is also possible to use one of the files (.shp, .shx or .dbf) in a
shapefile set as the dataset name, and then it will be treated as a dataset
with one layer.</p>

<p>Note that when reading a Shapefile of type SHPT_ARC, the corresponding layer
will be reported as of type wkbLineString, but depending on the number of
parts of each geometry, the actual type of the geometry for each feature
can be either OGRLineString or OGRMultiLineString.
The same applies for SHPT_POLYGON shapefiles, reported as layers of type
wkbPolygon, but depending on the number of parts of each geometry, the
actual type can be either OGRPolygon or OGRMultiPolygon.</p>

<p>ESRI measure values (XYM) are read as XYZ geometries. MultiPatch
files are read and each patch geometry is turned into a multi-polygon 
representation with one polygon per triangle in triangle fans and meshes.</p>

<p>If a .prj files in old Arc/Info style or new ESRI OGC WKT style is present,
it will be read and used to associate a projection with features.</p>

<p>The read driver assumes that multipart polygons follow the specification, 
that is to say the vertices of outer rings should be oriented clockwise on the 
X/Y plane, and those of inner rings counterclockwise.
If a Shapefile is broken w.r.t. that rule, it is possible to define the 
configuration option OGR_ORGANIZE_POLYGONS=DEFAULT to proceed to a full 
analysis based on topological relationships of the parts of the polygons so 
that the resulting polygons are correctly
defined in the OGC Simple Feature convention. </p>

<p>An attempt is made to read the LDID/codepage setting from the .dbf file 
and use it to translate string fields to UTF-8 on read, and back when writing.
LDID "87 / 0x57" is treated as ISO8859_1 which may not be appropriate.  
The SHAPE_ENCODING <a href="http://trac.osgeo.org/gdal/wiki/ConfigOptions">
configuration option</a> may be used to override the encoding interpretation
of the shapefile with any encoding supported by CPLRecode or to "" to avoid
any recoding. (Recoding support is new for GDAL/OGR 1.9.0)</p>

<h2>Spatial and Attribute Indexing</h2>

<p>The OGR Shapefile driver supports spatial indexing and a limited form of
attribute indexing.</p>

<p>The spatial indexing uses the same .qix quadtree spatial index files that
are used by UMN MapServer. Starting with OGR 1.10, it can also use the ESRI spatial index
files (.sbn / .sbx).  Spatial indexing can accelerate spatially filtered
passes through large datasets to pick out a small area quite dramatically.</p>

<p>To create a spatial index, issue a SQL command of the form</p>
<pre>CREATE SPATIAL INDEX ON tablename [DEPTH N]</pre>
<p>where optional DEPTH specifier can be used to control number of index tree levels
generated. If DEPTH is omitted, tree depth is estimated on basis of number of features
in a shapefile and its value ranges from 1 to 12.</p>

<p>To delete a spatial index issue a command of the form</p>
<pre>DROP SPATIAL INDEX ON tablename</pre>


<p>
  Otherwise, the <a href="http://mapserver.org">MapServer</a> shptree utility can be used:</p>
<pre>shptree &lt;shpfile&gt; [&lt;depth&gt;] [&lt;index_format&gt;]</pre>

<p>
More information is available about this utility at the
<a href="http://mapserver.org/utilities/shptree.html">MapServer shptree page</a>
</p>

<p>Currently the OGR Shapefile driver only supports attribute indexes for 
looking up specific values in a unique key column.  To create an attribute
index for a column issue an SQL command of the form "CREATE INDEX ON tablename
USING fieldname".  To drop the attribute indexes issue a command of the
form "DROP INDEX ON tablename".  The attribute index will accelerate
WHERE clause searches of the form "fieldname = value".  The attribute
index is actually stored as a mapinfo format index and is not compatible
with any other shapefile applications.</p>

<h2>Creation Issues</h2>

<p>The Shapefile driver treats a directory as a dataset, and each Shapefile
set (.shp, .shx, and .dbf) as a layer.  The dataset name will be treated
as a directory name.  If the directory already exists it is used and 
existing files in the directory are ignored.  If the directory does not
exist it will be created.</p>

<p>As a special case attempts to create a new dataset with the extension .shp
will result in a single file set being created instead of a directory.</p>

<p>ESRI shapefiles can only store one kind of geometry per layer (shapefile). 
On creation this is may be set based on the source file (if a uniform geometry
type is known from the source driver), or it may be set directly by the
user with the layer creation option SHPT (shown below).  If not set the
layer creation will fail.  If geometries of incompatible types are written
to the layer, the output will be terminated with an error.</p>

<p>Note that this can make it very difficult to translate a mixed geometry layer
from another format into Shapefile format using ogr2ogr, since ogr2ogr has
no support for separating out geometries from a source layer. See 
the <a href="http://trac.osgeo.org/gdal/wiki/FAQVector#HowdoItranslateamixedgeometryfiletoshapefileformat">FAQ</a> for a solution.</p>

<p>Shapefile feature attributes are stored in an associated .dbf file, and so
attributes suffer a number of limitations:</p>

<ul>
<li><p> Attribute names can only be up to 10 characters long. Starting with version 1.7,
the OGR Shapefile driver tries to generate
unique field names. Successive duplicate field names, including those created
by truncation to 10 characters, will be truncated to 8 characters and appended
with a serial number from 1 to 99. </p><p>For example:</p>
<ul><li>a &rarr; a, a &rarr; a_1, A &rarr; A_2;</li>
<li>abcdefghijk &rarr; abcdefghij, abcdefghijkl &rarr; abcdefgh_1</li></ul>
</li>

<li><p> Only Integer, Real, String and Date (not DateTime, just year/month/day)
field types are supported.  The various list, and binary field types cannot
be created.</p></li>

<li><p> The field width and precision are directly used to establish storage 
size in the .dbf file.  This means that strings longer than the field
width, or numbers that don't fit into the indicated field format will suffer
truncation.</p></li>

<li><p> Integer fields without an explicit width are treated as width 11.</p></li>

<li><p> Real (floating point) fields without an explicit width are treated as
width 24 with 15 decimal places of precision.</p></li>

<li><p> String fields without an assigned width are treated as 80 characters.</p></li>

</ul>

<p>Also, .dbf files are required to have at least one field.  If none are created
by the application an "FID" field will be automatically created and populated
with the record number.</p>

<p>The OGR shapefile driver supports rewriting existing shapes in a shapefile
as well as deleting shapes.  Deleted shapes are marked for deletion in 
the .dbf file, and then ignored by OGR.  To actually remove them permanently
(resulting in renumbering of FIDs) invoke the SQL 'REPACK &lt;tablename&gt;' via
the datasource ExecuteSQL() method.</p>

<h2>Field sizes</h2>

<p>Starting with GDAL/OGR 1.10, the driver knows to auto-extend string and integer fields
(up to the 255 bytes limit imposed by the DBF format) to dynamically accomodate for
the length of the data to be inserted.</p>

<p>It is also possible to force a resize of the fields to the optimal width by issuing a
SQL 'RESIZE &lt;tablename&gt;' via the datasource ExecuteSQL() method. This is convenient
in situations where the default column width (80 characters for a string field) is bigger than
necessary.</p>

<h2>Spatial extent</h2>

<p>Shapefiles store the layer spatial extent in the .SHP file. The layer spatial extent
is automatically updated when inserting a new feature in a shapefile. However when
updating an existing feature, if its previous shape was touching the bounding box of the
layer extent but the updated shape does not touch the new extent, the computed extent
will not be correct. It is then necessary to force a recomputation by invoking the
SQL 'RECOMPUTE EXTENT ON &lt;tablename&gt;' via the datasource ExecuteSQL() method. The
same applies for the deletion of a shape.</p>

<p>Note: RECOMPUTE EXTENT ON is available in OGR >= 1.9.0.</p>

<h2>Size Issues</h2>

<p>Geometry: The Shapefile format explicitly uses 32bit offsets and so cannot
go over 8GB (it actually uses 32bit offsets to 16bit words).
Hence, it is is not recommended to use a file size over 4GB.</p>

<p>Attributes: The dbf format does not have any offsets in it, so it can be
arbitrarily large.</p>

<h3>Dataset Creation Options</h3>

<p>None</p>

<h3>Layer Creation Options</h3>

<ul>
<li>
<b>SHPT=type</b>: Override the type of shapefile created.  Can be one of
NULL for a simple .dbf file with no .shp file, 
 POINT, ARC, POLYGON or MULTIPOINT for 2D, or 
POINTZ, ARCZ, POLYGONZ or MULTIPOINTZ for 3D.  Shapefiles with <i>measure</i>
values are not supported, nor are MULTIPATCH files.</li>

<li> <b>ENCODING=</b><i>value</i>: set the encoding value in the DBF file. The
default value is "LDID/87".  It is not clear what other values may be
appropriate.</li>

<li> <b>RESIZE=</b><i>yes/no</i>: (OGR &gt;= 1.10.0) set the yes to resize fields to their optimal
size. See above "Field sizes" section. Defaults to no.</li>
</ul>

<h3>VSI Virtual File System API support</h3>

The driver supports reading from files managed by VSI Virtual File System API, which include
"regular" files, as well as files in the /vsizip/, /vsigzip/ , /vsicurl/ domains.<p>

<h3>Examples</h3>

<ul>
<li>
<p>A merge of two shapefiles 'file1.shp' and 'file2.shp' into a new file
'file_merged.shp' is performed like this:</p>

<pre>
% ogr2ogr file_merged.shp file1.shp
% ogr2ogr -update -append file_merged.shp file2.shp -nln file_merged
</pre>

<p>The second command is opening file_merged.shp in update mode, and trying to
find existing layers and append the features being copied.</p>

<p>The -nln option sets the name of the layer to be copied to.</p>
</li>

<li>Building a spatial index :
<pre>
% ogrinfo file1.shp -sql "CREATE SPATIAL INDEX ON file1"
</pre>
</li>

<li>Resizing columns of a DBF file to their optimal size (OGR &gt;= 1.10.0) :
<pre>
% ogrinfo file1.dbf -sql "RESIZE file1"
</pre>
</li>


</ul>

<h3>See Also</h3>

<ul>
<li> <a href="http://shapelib.maptools.org/">Shapelib Page</a></li>
<li> <a href="http://trac.osgeo.org/gdal/wiki/UserDocs/Shapefiles">User Notes on OGR Shapefile Driver</a></li>
</ul>

</body>
</html>
